How do string quartet covers differ from the original tracks?

The corpus I will examine is Vitamin String Quartet’s covers of popular songs. As their discography is quite large, I cannot analyze all of their music, so I will focus on a random selection of 104 tracks. These covers have been added to one playlist, and their original versions to a separate playlist. This way, every comparison is between the covers and exactly the songs they are based on.

I chose this corpus because I love string quartet music. I play violin and cello myself and thus am personally attached to the type of instruments used, but I also think having a group of four instruments makes it possible to hear each instrument individually whilst being able to appreciate how they sound together. The layers of the music can each be distinctly heard. Covering popular music gives a creative twist to the original tracks, while showing people that instruments generally associated with classical music are very versatile and can even motivate people to also go and listen to classical music.

The most natural comparison points are the original versions of the covered songs. The most obvious differences will likely be acousticness and instrumentalness. I imagine danceability, loudness and energy could differ as well, but they could be higher in either one. I am curious to see whether fundamental parts of a track like the key and tempo will be altered in the covers.

Because I use a single artist to analyze string quartet covers, there will likely be more coherence between the covers than if I had taken several string quartets. In this sense, the corpus is not entirely representative of all string quartet covers. While the original tracks the covers are based on span several genres, most of them are popular music, so this will largely be a comparison with pop music. In this sense, the corpus is not representative of all genres either.


Valence, Energy and Loudness in String Quartet Covers compared to their Original Versions


Comparing valence, energy and loudness shows us how having only a string quartet as the performers of the piece changes the mood of a song. The graphs below show us this difference, making it clear that especially the energy of a song is quite different in a string quartet version. The size of the points represents the track’s loudness, and while there is not a very large difference, it also seems most string quartet versions are not as loud. For these graphs, I used a playlist I made of 104 covers by Vitamin String Quartet, and a playlist I made of the original versions of these 104 songs. When hovering over the graphs, you can see the track names and the exact value of valence, energy and loudness per track.

Chromagrams: Are string quartet versions lower pitched?


This comparison of the chromagrams of the original and the string quartet rendition of ‘Wonderwall’ already shows that the pitches in the two versions are different. As a chromagram captures harmonic and melodic characteristics, you would expect the chromagram of a cover to look similar to that of its original. I chose this track from my corpus because the durations of the two versions are so close that I thought they would be easy to compare. From these comparisons, however, it seems that the string quartet version is generally lower pitched than the original. On the next tab, the two will be combined to show their alignment with the dynamic time warping technique.

Dynamic Time Warping: The alignment of Oasis’s ‘Wonderwall’ versus its String Quartet cover


With Dynamic Time Warping, we can compare the pitches of the two versions in a single visualisation. If the two tracks have the same pitches but are performed by two different artists, we would see a diagonal line in this visualisation. In this one, there is no such line, and it is very difficult to see any similarity between the pitches in the two versions of ‘Wonderwall’. This leads us to conclude that while one is copying the melodies of the other, they do not align well. Although this was the clearest of the normalisation and distance combinations I tried, the visualisation is still quite unclear.
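To make the idea behind the alignment concrete, here is a minimal sketch of dynamic time warping on chroma vectors. It is not the code behind the visualisation above: the function names, the cosine distance and the toy five-note melody are all assumptions chosen for illustration. It shows that a slower performance of the same pitches aligns at zero cost, while a version shifted down by two semitones does not align at all until the shift is undone.

```python
import math

def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two chroma vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: treat silent frames as maximally distant
    return 1.0 - dot / (norm_u * norm_v)

def dtw_cost(seq_a, seq_b, dist=cosine_distance):
    """Total cost of the cheapest alignment path between two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(seq_a[i - 1], seq_b[j - 1])
            # Allowed moves: advance in a, advance in b, or advance in both.
            acc[i][j] = step + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]

def one_hot(pitch_class):
    """A 12-bin chroma vector with all energy in one pitch class."""
    return [1.0 if k == pitch_class else 0.0 for k in range(12)]

def transpose(frames, semitones):
    """Rotate every chroma vector by the given number of semitones."""
    shift = semitones % 12
    return [frame[-shift:] + frame[:-shift] if shift else list(frame)
            for frame in frames]

# Hypothetical toy melody (pitch classes C, E, G, E, C), not taken from 'Wonderwall'.
melody = [one_hot(p) for p in (0, 4, 7, 4, 0)]
stretched = [frame for frame in melody for _ in range(2)]  # same pitches, half speed
lowered = transpose(melody, -2)                            # two semitones lower

same_pitches_cost = dtw_cost(melody, stretched)            # aligns perfectly
lowered_cost = dtw_cost(melody, lowered)                   # no frame matches
restored_cost = dtw_cost(melody, transpose(lowered, 2))    # shift undone
print(same_pitches_cost, lowered_cost, restored_cost)
```

In this toy case the lowered version only aligns once the transposition is undone. If the cover really is in a different key, one option would be to try all twelve chroma rotations before aligning, though I have not tested whether that helps for this track.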

Now that we have compared string quartet covers to their originals, let’s compare a cover to itself.


These self-similarity matrices compare the chroma (pitches) and timbre (character) of Vitamin String Quartet’s rendition of the Oingo Boingo classic ‘Dead Man’s Party’. The visualisation on the left represents chroma, showing us at which points the same pitches appear in the track. The one on the right shows us the same thing for timbre, which is based on several factors like harmonic structure, frequency and intensity. Timbre gives us the “tone color” or “tone quality” of an instrument, which distinguishes it from other instruments.

In these visualisations, we see several diagonal lines. They are clearest in the one comparing chroma. These lines represent paths that mark similar segments. When you listen to the track while looking at the matrix, you can see that the first small diagonal paths are where short melodies repeat, while the longer, very clear ones are a repeat of a longer melody segment. The bright, perpendicular lines show a sudden new section. This is also audible in the track, as this is the point where an entirely new melody starts.

In the timbre matrix, we cannot see many clear paths. We do, however, see bright lines representing novelty, and a checkerboard pattern which shows us homogeneity. As this is a string quartet cover, the instrumentation consists of the same four instruments throughout, and they are all string instruments. This is likely the cause of the checkerboard pattern. The bright lines could represent the points where certain instruments either come in or stop playing.
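The two patterns described above, diagonal paths for repetition and blocks for homogeneity, can be reproduced with a minimal sketch of a self-similarity matrix. The two-dimensional feature vectors and the Euclidean distance are assumptions for illustration only; the real matrices are built from chroma and timbre features of the track.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def self_similarity(frames, dist=euclidean):
    """Pairwise distances between all frames of one track (0 = identical)."""
    return [[dist(a, b) for b in frames] for a in frames]

# Hypothetical features: a three-frame phrase x, y, z that repeats once,
# followed by three identical frames h (a homogeneous passage).
x, y, z, h = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]
frames = [x, y, z, x, y, z, h, h, h]
ssm = self_similarity(frames)

# Repetition: a stripe of zeros parallel to the main diagonal, three frames away.
repeat_stripe = [ssm[i][i + 3] for i in range(3)]
# Homogeneity: a square block of zeros where the frames do not change.
homogeneous_block = [ssm[i][j] for i in range(6, 9) for j in range(6, 9)]
print(repeat_stripe, homogeneous_block)
```

The repeated phrase shows up as a short diagonal of zero distances, and the unchanging passage as a solid block, which is the same logic as the paths and the checkerboard in the matrices above.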

Will a Self-Similarity Matrix for the original track show the same patterns?


In these self-similarity matrices, we see which sections of the original Oingo Boingo version of Dead Man’s Party are similar in terms of chroma and in terms of timbre. As with the Vitamin String Quartet cover, there are several diagonal lines clearly visible in the matrices.

The chroma matrix shows diagonal lines at roughly the same time points as the string quartet version. This probably means that where melodies repeat, the repetition is visible in both versions, as they use the same melodies. In the string quartet version, however, there are more diagonal lines close to each other. This could be because the four instruments in the string quartet version echo one another more often than the instruments in the original do.

The timbre matrix is a lot clearer than that of the Vitamin String Quartet version. The diagonal lines are clearer, but the checkerboard pattern is less distinct. This is probably due to the instrumentation. The string quartet version, of course, only uses string instruments from the same family. This makes it harder for the timbre matrix to pick up the distinctive qualities of individual instruments. In the original version, instruments from different families are used, so the differences are easier to see.

Is the key of a piece changed when it is rewritten for a string quartet?


Musical key describes the scale on which a song is based, meaning that most of the notes in a song will come from the scale of that key. In this bar plot, we can see that for the different keys, the number of songs in the playlist of Vitamin String Quartet covers (Vitamin for CompMusic) is not the same as the number of originals (Originals for CompMusic) in that same key. As the Vitamin String Quartet playlist only contains the covers of the songs on the Originals playlist, and no others, this means that the keys Spotify reports for several songs must have changed in the transition to their string quartet version.
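The reasoning from key counts to changed keys can be sketched as follows. The four rows are hypothetical and are not taken from my playlists; the point is only to show what the per-key counts do and do not reveal. Differing counts prove that at least some keys changed, but changes that cancel each other out (one song moving from G to E, another from E to G) stay invisible in a bar plot and only show up in a track-by-track comparison.

```python
from collections import Counter

# Hypothetical (title, original key, cover key) rows; NOT data from the corpus.
pairs = [
    ("Song A", "C", "C"),
    ("Song B", "G", "E"),
    ("Song C", "E", "G"),
    ("Song D", "A", "D"),
]

# What the bar plot shows: number of tracks per key in each playlist.
original_counts = Counter(original for _, original, _ in pairs)
cover_counts = Counter(cover for _, _, cover in pairs)
all_keys = set(original_counts) | set(cover_counts)
keys_with_different_counts = sorted(
    key for key in all_keys if original_counts[key] != cover_counts[key]
)

# What a track-by-track comparison shows: every cover whose key differs.
changed = [title for title, original, cover in pairs if original != cover]

print(keys_with_different_counts)
print(changed)
```

In this toy example the counts differ only for A and D, although three of the four covers changed key, so the bar plot gives a lower bound on the number of changes rather than the exact number.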

How does the tempo change when adapting a piece for a string quartet?


As I was examining my corpus’s tempo, I saw that there are no clear outliers, and both the Vitamin String Quartet Covers playlist and the Originals playlist stayed within roughly the same range. On closer inspection, however, I found that the highest and lowest tempo songs did not match across the playlists. For example, the highest tempo song among the string quartet versions was ‘Little Black Submarines’. Its original did not rank particularly high in the Originals playlist. This is why I thought it would be interesting to compare the tempograms of these two versions.

In the original ‘Little Black Submarines’ by The Black Keys, you can see that there is a very obvious sound change at around 130 seconds. There is a short pause and then a distorted guitar. This is clearly visible in the tempogram, where you can see it is having difficulty determining the tempo there, and it therefore looks like a vague gap in the piece. After this first change, the song goes back to the melody it had before, but with different instrumentation and it sounds like the tempo is slightly higher. This change is also visible in the tempogram, where you can see the lines shift so that they do not align perfectly with the part before 130 seconds.

According to Spotify, the tempo of the Vitamin String Quartet version of ‘Little Black Submarines’ is a little over double the BPM of the original. When listening to both, I can imagine why, as the string quartet version has strings playing in the background that make it sound faster than the very basic instrumentation of this part of the original. Something that is visible in both tempograms is the change when the song turns into more of a rock song. The tempogram for Vitamin String Quartet also shows this sudden change at around 130 seconds.
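A tempo estimate that is roughly double or half of another one often describes the same underlying beat counted at a different metrical level, so it can be useful to check for that before concluding that a cover is really played faster. The sketch below does this check. The function name, the 8% tolerance and the BPM values are all hypothetical and are not the values Spotify reports for these two tracks.

```python
def tempo_relation(original_bpm, cover_bpm, tolerance=0.08):
    """Classify the cover tempo as the same, double, or half of the original.

    The tolerance is relative: 0.08 accepts ratios within 8% of each factor.
    """
    ratio = cover_bpm / original_bpm
    for factor, label in ((1.0, "same"), (2.0, "double"), (0.5, "half")):
        if abs(ratio - factor) <= tolerance * factor:
            return label
    return "unrelated"

# Hypothetical BPM values chosen to mimic "a little over double".
relation = tempo_relation(original_bpm=84.0, cover_bpm=172.0)
print(relation)
```

If the relation is "double", comparing the cover at half its reported BPM may give a fairer picture of how the tempo was carried over.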

Can a classifier determine the difference between string quartet covers and their original tracks?


In this confusion matrix, we look at a classifier’s predictions of which song will fit which playlist, compared to which playlist it is actually in. For this matrix, I fed the classifier both my Vitamin String Quartet covers playlist and my playlist of the original versions in question. The Spotify API features used to train the classifier are danceability, energy, valence, loudness, speechiness, acousticness, instrumentalness, liveness, tempo, duration and key.

I experimented with leaving particular audio features out. In some cases, for example when leaving out valence, this led to higher accuracy, but it would not be representative to leave important features out.

For the Originals playlist, the precision was 0.750, while the recall was 0.9. For the Vitamin String Quartet playlist the precision was 0.875, while the recall was 0.7. While a human would immediately be able to tell whether or not the song was only played by string instruments, the classifier still did a decent job.
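To show how these four numbers follow from a confusion matrix, here is a small worked sketch. The counts are hypothetical: I picked the smallest whole numbers that reproduce the reported precision and recall, so they are not the actual cell values of the matrix above.

```python
# Hypothetical counts keyed by (actual playlist, predicted playlist).
confusion = {
    ("Originals", "Originals"): 9,
    ("Originals", "Vitamin"): 1,
    ("Vitamin", "Originals"): 3,
    ("Vitamin", "Vitamin"): 7,
}

def precision(confusion, label):
    """Of all tracks predicted as `label`, the share that really belong to it."""
    predicted = sum(n for (_, pred), n in confusion.items() if pred == label)
    return confusion[(label, label)] / predicted

def recall(confusion, label):
    """Of all tracks that really belong to `label`, the share that was found."""
    actual = sum(n for (act, _), n in confusion.items() if act == label)
    return confusion[(label, label)] / actual

def accuracy(confusion):
    """Share of all tracks that were assigned to the correct playlist."""
    correct = sum(n for (act, pred), n in confusion.items() if act == pred)
    return correct / sum(confusion.values())

for label in ("Originals", "Vitamin"):
    print(label, precision(confusion, label), recall(confusion, label))
print("accuracy", accuracy(confusion))
```

With these illustrative counts, the classifier labels too many tracks as originals: that is why the Originals playlist has high recall but lower precision, and the Vitamin String Quartet playlist the reverse.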

Conclusions: What differences have we seen?

On various levels we have seen that while the string quartet versions are covers, and keep the main melodies of a song, they differ greatly.

For one, the string quartet versions are much lower energy. This could be due to the different instrumentation, which Spotify registers as being less energetic. They are also less loud, which makes some sense when you think about some of the instruments used in the originals that are not part of a string quartet.

We’ve also seen that despite the fact that the covers mimic the melodies in the original, they can change the pitches significantly. In Wonderwall, for example, it seems like the entire cover version is lower pitched than the original. This also means that you cannot compare the covers to their originals with Dynamic Time Warping, as they simply do not align.

Through self-similarity matrices, we have seen that the timbre of the string quartet versions makes a huge difference to how well we can detect different qualities of the piece. The string quartet is made up of instruments of the same family, so it is hard to distinguish different timbres, while the originals use a varying number of instruments and instrument families.

We’ve also seen that the key has been changed for several songs. I am not sure why this is, but it is an interesting finding. Most of the string quartet covers are still very easy to recognize as the original song, despite being in a different key.

The tempo of a song generally translates to the string quartet version fairly evenly. Even if the overall tempo of a song is not exactly the same, any changes within the original will also be visible changes in the cover.

Last but not least, with all these differences, you would think a classifier can easily determine which piece is the cover and which is the original. A human ear would immediately identify it from the instrumentation (and in most cases, lack of vocals). The classifier does indeed do a pretty good job, but it is not perfect. In this sense, the human ear is better at identifying what is what than the computer.

What I have learned from this is that Vitamin String Quartet mimics the originals in a way that makes it very easy for us to recognize the original song, even if the qualities of the piece are heavily altered. The main conclusion I draw is that, in this case, humans detect the differences more easily than a computer using the Spotify API. This could be interesting for future improvements, mainly to classifiers and perhaps to the Spotify API itself.